102 research outputs found

    A tree-based kernel for graphs with continuous attributes

    Full text link
    The availability of graph data with node attributes that can be either discrete or real-valued is constantly increasing. While existing kernel methods are effective techniques for dealing with graphs having discrete node labels, their adaptation to non-discrete or continuous node attributes has been limited, mainly for computational issues. Recently, a few kernels especially tailored for this domain, and that trade predictive performance for computational efficiency, have been proposed. In this paper, we propose a graph kernel for complex and continuous nodes' attributes, whose features are tree structures extracted from specific graph visits. The kernel manages to keep the same complexity of state-of-the-art kernels while implicitly using a larger feature space. We further present an approximated variant of the kernel which reduces its complexity significantly. Experimental results obtained on six real-world datasets show that the kernel is the best performing one on most of them. Moreover, in most cases the approximated version reaches comparable performances to current state-of-the-art kernels in terms of classification accuracy while greatly shortening the running times.Comment: This work has been submitted to the IEEE Transactions on Neural Networks and Learning Systems for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessibl

    An Empirical Study on Budget-Aware Online Kernel Algorithms for Streams of Graphs

    Full text link
    Kernel methods are considered an effective technique for on-line learning. Many approaches have been developed for compactly representing the dual solution of a kernel method when the problem imposes memory constraints. However, in literature no work is specifically tailored to streams of graphs. Motivated by the fact that the size of the feature space representation of many state-of-the-art graph kernels is relatively small and thus it is explicitly computable, we study whether executing kernel algorithms in the feature space can be more effective than the classical dual approach. We study three different algorithms and various strategies for managing the budget. Efficiency and efficacy of the proposed approaches are experimentally assessed on relatively large graph streams exhibiting concept drift. It turns out that, when strict memory budget constraints have to be enforced, working in feature space, given the current state of the art on graph kernels, is more than a viable alternative to dual approaches, both in terms of speed and classification performance.Comment: Author's version of the manuscript, to appear in Neurocomputing (ELSEVIER

    Ordinal text quantification

    Get PDF

    Kernel Methods for Tree Structured Data

    Get PDF
    Machine learning comprises a series of techniques for automatic extraction of meaningful information from large collections of noisy data. In many real world applications, data is naturally represented in structured form. Since traditional methods in machine learning deal with vectorial information, they require an a priori form of preprocessing. Among all the learning techniques for dealing with structured data, kernel methods are recognized to have a strong theoretical background and to be effective approaches. They do not require an explicit vectorial representation of the data in terms of features, but rely on a measure of similarity between any pair of objects of a domain, the kernel function. Designing fast and good kernel functions is a challenging problem. In the case of tree structured data two issues become relevant: kernel for trees should not be sparse and should be fast to compute. The sparsity problem arises when, given a dataset and a kernel function, most structures of the dataset are completely dissimilar to one another. In those cases the classifier has too few information for making correct predictions on unseen data. In fact, it tends to produce a discriminating function behaving as the nearest neighbour rule. Sparsity is likely to arise for some standard tree kernel functions, such as the subtree and subset tree kernel, when they are applied to datasets with node labels belonging to a large domain. A second drawback of using tree kernels is the time complexity required both in learning and classification phases. Such a complexity can sometimes prevents the kernel application in scenarios involving large amount of data. This thesis proposes three contributions for resolving the above issues of kernel for trees. A first contribution aims at creating kernel functions which adapt to the statistical properties of the dataset, thus reducing its sparsity with respect to traditional tree kernel functions. Specifically, we propose to encode the input trees by an algorithm able to project the data onto a lower dimensional space with the property that similar structures are mapped similarly. By building kernel functions on the lower dimensional representation, we are able to perform inexact matchings between different inputs in the original space. A second contribution is the proposal of a novel kernel function based on the convolution kernel framework. Convolution kernel measures the similarity of two objects in terms of the similarities of their subparts. Most convolution kernels are based on counting the number of shared substructures, partially discarding information about their position in the original structure. The kernel function we propose is, instead, especially focused on this aspect. A third contribution is devoted at reducing the computational burden related to the calculation of a kernel function between a tree and a forest of trees, which is a typical operation in the classification phase and, for some algorithms, also in the learning phase. We propose a general methodology applicable to convolution kernels. Moreover, we show an instantiation of our technique when kernels such as the subtree and subset tree kernels are employed. In those cases, Direct Acyclic Graphs can be used to compactly represent shared substructures in different trees, thus reducing the computational burden and storage requirements

    Findings of the NLP4IF-2019 Shared Task on Fine-Grained Propaganda Detection

    Full text link
    We present the shared task on Fine-Grained Propaganda Detection, which was organized as part of the NLP4IF workshop at EMNLP-IJCNLP 2019. There were two subtasks. FLC is a fragment-level task that asks for the identification of propagandist text fragments in a news article and also for the prediction of the specific propaganda technique used in each such fragment (18-way classification task). SLC is a sentence-level binary classification task asking to detect the sentences that contain propaganda. A total of 12 teams submitted systems for the FLC task, 25 teams did so for the SLC task, and 14 teams eventually submitted a system description paper. For both subtasks, most systems managed to beat the baseline by a sizable margin. The leaderboard and the data from the competition are available at http://propaganda.qcri.org/nlp4if-shared-task/.Comment: propaganda, disinformation, fake news. arXiv admin note: text overlap with arXiv:1910.0251

    That is a Known Lie: Detecting Previously Fact-Checked Claims

    Full text link
    The recent proliferation of "fake news" has triggered a number of responses, most notably the emergence of several manual fact-checking initiatives. As a result and over time, a large number of fact-checked claims have been accumulated, which increases the likelihood that a new claim in social media or a new statement by a politician might have already been fact-checked by some trusted fact-checking organization, as viral claims often come back after a while in social media, and politicians like to repeat their favorite statements, true or false, over and over again. As manual fact-checking is very time-consuming (and fully automatic fact-checking has credibility issues), it is important to try to save this effort and to avoid wasting time on claims that have already been fact-checked. Interestingly, despite the importance of the task, it has been largely ignored by the research community so far. Here, we aim to bridge this gap. In particular, we formulate the task and we discuss how it relates to, but also differs from, previous work. We further create a specialized dataset, which we release to the research community. Finally, we present learning-to-rank experiments that demonstrate sizable improvements over state-of-the-art retrieval and textual similarity approaches.Comment: detecting previously fact-checked claims, fact-checking, disinformation, fake news, social media, political debate

    Efficient Online Learning for Mapping Kernels on Linguistic Structures

    Get PDF
    Kernel methods are popular and effective techniques for learn- ing on structured data, such as trees and graphs. One of their major drawbacks is the computational cost related to making a prediction on an example, which manifests in the classifica- tion phase for batch kernel methods, and especially in online learning algorithms. In this paper, we analyze how to speed up the prediction when the kernel function is an instance of the Mapping Kernels, a general framework for specifying ker- nels for structured data which extends the popular convolution kernel framework. We theoretically study the general model, derive various optimization strategies and show how to apply them to popular kernels for structured data. Additionally, we derive a reliable empirical evidence on semantic role labeling task, which is a natural language classification task, highly dependent on syntactic trees. The results show that our faster approach can clearly improve on standard kernel-based SVMs, which cannot run on very large datasets
    • …
    corecore